
Collaborating Authors: Victoria


Assessing LLMs for Front-end Software Architecture Knowledge

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated significant promise in automating software development tasks, yet their capabilities with respect to software design tasks remain largely unclear. This study investigates the capabilities of an LLM in understanding, reproducing, and generating structures within the complex VIPER architecture, a design pattern for iOS applications. We leverage Bloom's taxonomy to develop a comprehensive evaluation framework that assesses the LLM's performance across cognitive domains: remembering, understanding, applying, analyzing, evaluating, and creating. Experimental results, using ChatGPT 4 Turbo 2024-04-09, reveal that the LLM excelled at higher-order tasks such as evaluating and creating, but struggled with lower-order tasks requiring precise retrieval of architectural details. These findings highlight both the potential of LLMs to reduce development costs and the barriers to their effective application in real-world software design scenarios. The study proposes a benchmark format for assessing LLM capabilities in software architecture, aiming to contribute toward more robust and accessible AI-driven development tools.
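
As a minimal illustration of what such a Bloom's-taxonomy benchmark format could look like, the sketch below tags each architecture question with a cognitive level so scores can be aggregated per level; all field names and example items are hypothetical, not the paper's published schema.

```python
# Sketch of a Bloom's-taxonomy-tagged benchmark item for software-architecture
# questions. Field names and items are illustrative placeholders only.
from dataclasses import dataclass
from enum import Enum

class BloomLevel(Enum):
    REMEMBER = 1
    UNDERSTAND = 2
    APPLY = 3
    ANALYZE = 4
    EVALUATE = 5
    CREATE = 6

@dataclass
class BenchmarkItem:
    level: BloomLevel   # cognitive domain being probed
    prompt: str         # task given to the LLM
    reference: str      # expected answer or grading rubric
    architecture: str = "VIPER"

items = [
    BenchmarkItem(BloomLevel.REMEMBER,
                  "Which VIPER component owns navigation logic?",
                  "The Router (Wireframe)."),
    BenchmarkItem(BloomLevel.CREATE,
                  "Generate an Interactor for a login module.",
                  "Rubric: compiles, no UI imports, talks only to the Presenter."),
]

# Group items per Bloom level to reproduce the lower-order vs.
# higher-order comparison reported in the abstract.
by_level: dict[BloomLevel, list[BenchmarkItem]] = {}
for item in items:
    by_level.setdefault(item.level, []).append(item)
```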


Human Latency Conversational Turns for Spoken Avatar Systems

arXiv.org Artificial Intelligence

A problem with many current Large Language Model (LLM) driven spoken dialogue systems is response time. Some efforts, such as Groq, address this issue with lightning-fast LLM processing, but the cognitive psychology literature shows that in human-to-human dialogue, responses often begin before the speaker has completed their utterance. No amount of delay for LLM processing is acceptable if we wish to maintain human dialogue latencies. In this paper, we discuss methods for understanding an utterance in close to real time and generating a response so that the system can comply with human-level conversational turn delays. This means that the information content of the final part of the speaker's utterance is lost to the LLM. Using the Google NaturalQuestions (NQ) dataset, our results show that GPT-4 can effectively fill in the missing context from a word dropped at the end of a question over 60% of the time. We also provide examples of utterances and the impact of this information loss on the quality of LLM responses in the context of an avatar that is currently under development. These results indicate that a simple classifier could determine whether a question is semantically complete or requires a filler phrase, allowing a response to be generated within human dialogue time constraints.
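
A minimal sketch of the dropped-final-word probe described above; `ask_llm` is a hypothetical stand-in for whatever LLM client is used, and the prompt wording is illustrative rather than the paper's exact protocol.

```python
# Sketch of the end-of-question truncation probe on NQ-style questions:
# drop the final word, then ask the model to answer anyway, forcing it
# to infer the lost context.

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: wire up your actual LLM client here.
    raise NotImplementedError

def truncate_last_word(question: str) -> str:
    words = question.strip().rstrip("?").split()
    return " ".join(words[:-1])  # drop the final word

def probe(question: str) -> str:
    partial = truncate_last_word(question)
    return ask_llm(f"Answer this possibly incomplete question: {partial}")

# Example: "who wrote the declaration of independence?" becomes
# "who wrote the declaration of" before it is sent to the model.
```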


Evaluating the Determinants of Mode Choice Using Statistical and Machine Learning Techniques in the Indian Megacity of Bengaluru

arXiv.org Artificial Intelligence

The decision making behind mode choice is critical for transportation planning. While statistical techniques like discrete choice models have been used traditionally, machine learning (ML) models have recently gained traction among transportation planners due to their higher predictive performance. However, the black-box nature of ML models poses significant interpretability challenges, limiting their practical application in decision and policy making. This study used a dataset of 1,350 households in the low and lower-middle income brackets in the city of Bengaluru to investigate mode choice behaviour using a multinomial logit model and ML classifiers: decision trees, random forests, extreme gradient boosting (XGBoost), and support vector machines. In terms of accuracy, the random forest model performed best (0.788 on training data and 0.605 on test data). The research adopts modern interpretability techniques, namely feature importance and individual conditional expectation (ICE) plots, to explain the decision making behaviour captured by the ML models. Higher travel costs significantly reduce the predicted probability of bus usage relative to other modes (a 0.66% and 0.34% reduction in the random forest and XGBoost models, respectively, for a 10% increase in travel cost), while reducing travel time by 10% increases the preference for the metro (0.16% in the random forest and 0.42% in XGBoost). This work augments ongoing research on mode choice analysis using machine learning, improving our understanding of how these models perform on real-world data in terms of both accuracy and interpretability.
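
A minimal sketch of the interpretability workflow the abstract describes, using scikit-learn; the file name, feature columns (travel_cost, travel_time, etc.), and class labels are assumptions for illustration, not the study's actual variables.

```python
# Random forest mode-choice classifier with permutation importance and
# individual conditional expectation (ICE) curves via scikit-learn.
# All column and class names below are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.model_selection import train_test_split

df = pd.read_csv("mode_choice.csv")  # hypothetical household survey file
X = df[["travel_cost", "travel_time", "income", "household_size"]]
y = df["mode"]                       # e.g. "bus", "metro", "walk", ...

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", rf.score(X_te, y_te))

# Model-agnostic feature importance, measured on held-out data.
imp = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
print(dict(zip(X.columns, imp.importances_mean.round(3))))

# ICE curves: one line per household, showing how the predicted
# probability of the "bus" class shifts as travel_cost varies.
PartialDependenceDisplay.from_estimator(
    rf, X_te, features=["travel_cost"], kind="individual", target="bus")
```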


Multi-Stage Graph Peeling Algorithm for Probabilistic Core Decomposition

arXiv.org Machine Learning

Mining dense subgraphs, where vertices connect closely with each other, is a common task in graph analysis. A very popular notion in subgraph analysis is core decomposition. Recently, Esfahani et al. presented a probabilistic core decomposition algorithm, based on graph peeling and the Central Limit Theorem (CLT), that is capable of handling very large graphs. Their peeling algorithm (PA) starts from the lowest-degree vertices and recursively deletes them, assigning core numbers and updating the degrees of neighbouring vertices, until it reaches the maximum core. However, in many applications, particularly in biology, more valuable information is found in dense sub-communities, and we are not interested in small cores where vertices interact little with others. To make the previous PA focus on dense subgraphs, we propose a multi-stage graph peeling algorithm (M-PA) that adds a two-stage data-screening procedure before the previous PA. By removing vertices from the graph according to user-defined thresholds, we can greatly reduce graph complexity without affecting the vertices in the subgraphs of interest. We show that M-PA is more efficient than the previous PA and, with a properly set filtering threshold, produces dense subgraphs very similar, if not identical, to those of the previous PA (in terms of graph density and clustering coefficient).
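
A minimal deterministic analogue of the screen-then-peel idea; the paper's M-PA works on probabilistic graphs via the CLT, whereas this sketch uses exact degrees, and the threshold name is an assumption.

```python
# Deterministic sketch of M-PA's two ingredients: (1) pre-screen out
# vertices below a user-defined degree threshold, (2) peel the remainder
# from lowest degree upward, assigning core numbers on removal.

def screen(adj, min_degree):
    """Repeatedly drop vertices whose degree falls below the threshold."""
    adj = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        changed = False
        for u in [u for u, vs in adj.items() if len(vs) < min_degree]:
            for v in adj.pop(u):
                adj[v].discard(u)
            changed = True
    return adj

def peel(adj):
    """Classic peeling: core number = current degree at removal time."""
    deg = {u: len(vs) for u, vs in adj.items()}
    core, remaining, k = {}, set(adj), 0
    while remaining:
        u = min(remaining, key=lambda x: deg[x])  # lowest-degree vertex
        k = max(k, deg[u])                        # core numbers never decrease
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                deg[v] -= 1
    return core

# Usage: screening first shrinks the graph, then peeling assigns cores
# only to the dense part that survived the filter.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(peel(screen(graph, min_degree=2)))  # vertex 4 is screened out
```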


Batter up! EA Sports gets back into baseball video games with 'Super Mega Baseball' studio deal

USATODAY - Tech Top Stories

EA Sports has made an acquisition that gets it back into baseball video games. The sports game division of Electronic Arts is adding Metalhead Software, a Victoria, British Columbia, studio that makes the Super Mega Baseball video games, to its lineup. "Super Mega Baseball 3," released in March 2020, has an arcade look, but "it's a really well-made game" that plays like a simulation under the hood, EA Sports executive vice president and general manager Cam Weber told USA TODAY. One of the largest video game publishers in the U.S., EA posted revenue of $5.5 billion in fiscal year 2020.


AIBA: An AI Model for Behavior Arbitration in Autonomous Driving

arXiv.org Artificial Intelligence

Driving in dynamically changing traffic is a highly challenging task for autonomous vehicles, especially on crowded urban roadways. The Artificial Intelligence (AI) system of a driverless car must be able to arbitrate between different driving strategies in order to properly plan the car's path, based on an understandable traffic scene model. In this paper, an AI behavior arbitration algorithm for Autonomous Driving (AD) is proposed. The method, coined AIBA (AI Behavior Arbitration), was developed in two stages: (i) description and understanding of the human driving scene, and (ii) formal modelling. The scene description is achieved by mimicking a human cognition model, while the modelling part is based on a formal representation that approximates the human driver's understanding process. The advantage of the formal representation is that the functional safety of the system can be analytically inferred. The performance of the algorithm was evaluated in Virtual Test Drive (VTD), a comprehensive traffic simulator, and in GridSim, a vehicle kinematics engine for prototypes.
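
As a loose illustration of "arbitration between driving strategies" (not AIBA's actual formalism, which is a human-cognition-inspired formal model), one can picture a scorer over candidate behaviors given a toy scene description; every name and number below is hypothetical.

```python
# Illustrative-only sketch of behavior arbitration: score each candidate
# driving behavior against a toy scene model and pick the best one.
from dataclasses import dataclass

@dataclass
class Scene:
    gap_ahead_m: float      # free distance to the lead vehicle
    left_lane_free: bool
    ego_speed_mps: float

def score(behavior: str, s: Scene) -> float:
    if behavior == "follow_lane":
        return min(s.gap_ahead_m / 50.0, 1.0)      # comfortable if gap is large
    if behavior == "overtake":
        return 0.8 if s.left_lane_free and s.gap_ahead_m < 30 else 0.0
    if behavior == "brake":
        return 1.0 if s.gap_ahead_m < 10 else 0.1  # safety fallback
    return 0.0

def arbitrate(s: Scene) -> str:
    return max(["follow_lane", "overtake", "brake"], key=lambda b: score(b, s))

print(arbitrate(Scene(gap_ahead_m=20, left_lane_free=True, ego_speed_mps=14)))
# -> "overtake" in this toy scene
```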


Artificial intelligence could predict El Niño up to 18 months in advance

#artificialintelligence

The dreaded El Niño strikes the globe every 2 to 7 years. As warm waters in the tropical Pacific Ocean shift eastward and trade winds weaken, the weather pattern ripples through the atmosphere, causing drought in southern Africa, wildfires in South America, and flooding on North America's Pacific coast. Climate scientists have struggled to predict El Niño events more than 1 year in advance, but artificial intelligence (AI) can now extend forecasts to 18 months, according to a new study. The work could help people in threatened regions better prepare for droughts and floods, for example by choosing which crops to plant, says William Hsieh, a retired climate scientist in Victoria, Canada, who worked on early El Niño forecasts but was not involved in the current study. Longer forecasts could have "large economic benefits," he says.


This 14-year-old made the best Facebook Messenger chatbot - BBC News

#artificialintelligence

Yet despite the promise of a revolution in how we interact with services and companies online, progress has been utterly miserable: the vast majority of chatbots are gimmicky, pointless or just flat-out broken. But this week I was given great cause for optimism, in the form of Alec Jones, a 14-year-old from Victoria, Canada. For the past six months, Alec has been working on Christopher Bot, a chatbot that helps students keep track of the homework they've been given over the course of a week. To set things up, a student shares his or her schedule with Christopher Bot, and from then on it sends a quick message at the end of each lesson asking whether any homework has been set. "Do you have homework for maths?" it asked 30-year-old me, pretending to be a child for the sake of this piece.


Information Flow and the Distinction Between Self-Organized and Top-Down Dynamics in Bicycle Pelotons

AAAI Conferences

Information in bicycle pelotons consists of two main types: displayed information that is perceptible to others, and hidden information available to individual riders about their own physical state. Flow (or transfer) of information in pelotons occurs in two basic ways: 1) between cyclists within a peloton, which riders exploit to adjust tactical objectives (“intra-peloton”); and 2) from sources outside the peloton, fed to riders via radio communication or by third parties (“extra-peloton”). A conceptual framework is established for intra-peloton and extra-peloton information transfer. Both kinds of information transfer affect the complex dynamics of pelotons, which exhibit mixed self-organized and top-down dynamics. These can be isolated and examined independently: self-organized dynamics emerge through local physical rules of interaction, and are distinguishable from the top-down dynamics of human competition, decision-making and information transfer. Both intra- and extra-peloton information flow affect individual rider positions and the timing of their positional changes, but neither type fundamentally alters self-organized structures. In addition to two previously identified peloton resources for which riders compete, energy saved by drafting and near-front positions, information flow is identified as a third peloton resource. Also, building upon previous work on peloton phase transitions and self-organized group sorting, a transition is identified between a team cluster state, in which team-mates ride near each other, and a self-organized “fitness” cluster state, in which riders of nearly equal fitness levels gravitate toward each other.


Hysteresis in Competitive Bicycle Pelotons

AAAI Conferences

A peloton is a group of cyclists whose individual and collective energy expenditures are reduced when they ride behind others in zones of reduced air pressure, an effect known in cycling as ‘drafting’. Through drafting, cyclists couple their energy expenditures; this coupling is the basic peloton property from which self-organized collective behaviours emerge. Here we examine peloton hysteresis, applying the definition used for vehicle traffic, in which a rapid deceleration to a high-density state (a jam) is followed by a lag in vehicle acceleration. Applying a flow analysis of volume (number of cyclists) over time, we examine peloton hysteresis in three forms. The first is similar to vehicle traffic hysteresis, in which rapid decelerations and increased flow (or density) are followed by extended acceleration periods and reduced flow; in cycling this is known as the accordion effect. The second results from rapid accelerations followed by periods of decreasing speed and decreasing flow; this form is essentially the inverse of traffic hysteresis and the accordion effect, and we demonstrate it using data from a mass-start bicycle points race. The third occurs when the drafting benefit is minimized on hills and weaker cyclists lose positions in the peloton while flow/density is retained. A computer simulation of two sets of cyclist agents, each with a different output capacity, models this hysteresis as a peloton transitions from flat topography to a steep incline on which drafting is negligible.
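
A toy version of the simulation described in the last sentence, assuming illustrative power numbers; it aims only to show the mechanism (the drafting discount vanishing on the climb), not to reproduce the paper's actual model.

```python
# Toy two-group peloton on a flat-to-climb course. On the flat, drafting
# cuts the power needed to hold the group's speed; on the climb the
# drafting benefit is negligible, so weaker riders drop off the back.
# All numbers are illustrative, not taken from the paper.
FLAT_DEMAND_W, CLIMB_DEMAND_W = 300.0, 380.0
DRAFT_DISCOUNT = 0.30   # assumed ~30% power saving while drafting on the flat

riders = ([{"max_w": 400.0, "in_peloton": True} for _ in range(20)]    # strong
          + [{"max_w": 320.0, "in_peloton": True} for _ in range(20)])  # weak

for segment in ["flat", "climb"]:
    demand = FLAT_DEMAND_W if segment == "flat" else CLIMB_DEMAND_W
    discount = DRAFT_DISCOUNT if segment == "flat" else 0.0  # no draft uphill
    required = demand * (1.0 - discount)
    for r in riders:
        r["in_peloton"] = r["in_peloton"] and r["max_w"] >= required
    kept = sum(r["in_peloton"] for r in riders)
    print(f"{segment}: {kept}/{len(riders)} riders remain in the peloton")

# flat:  40/40 remain (draft-adjusted demand of 210 W is within both capacities)
# climb: 20/40 remain (380 W exceeds the weaker group's 320 W ceiling)
```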